A Set of Batched Basic Linear Algebra Subprograms and LAPACK Routines

Authors

Abstract

This article describes a standard API for a set of Batched Basic Linear Algebra Subprograms (Batched BLAS or BBLAS). The focus is on many independent BLAS operations on small matrices that are grouped together and processed by a single routine, called a Batched BLAS routine. The matrices are grouped together in uniformly sized groups, with just one group if all the matrices are of equal size. The aim is to provide more efficient, but portable, implementations of algorithms on high-performance many-core platforms. These include multicore and many-core CPU processors, GPUs and coprocessors, and other hardware accelerators with floating-point compute facility. As well as the standard types of single and double precision, we also include half and quadruple precision in the standard. In particular, half precision is used in many very large scale applications, such as those associated with machine learning.
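The grouping idea in the abstract can be illustrated with a small sketch. The code below is not the interface defined by the article; the names `GemmGroup`, `gemm`, and `batched_gemm` are hypothetical, and plain Python lists stand in for the tuned kernels a real implementation would use. It only shows the structure: each group holds problems of one uniform size, and a single call processes every group.

```python
from dataclasses import dataclass
from typing import List

Matrix = List[List[float]]


@dataclass
class GemmGroup:
    """Hypothetical container: one uniformly sized group of GEMM problems.

    Every problem in the group shares m, n, k, alpha and beta.
    """
    m: int
    n: int
    k: int
    alpha: float
    beta: float
    a: List[Matrix]  # each matrix is m x k
    b: List[Matrix]  # each matrix is k x n
    c: List[Matrix]  # each matrix is m x n, updated in place


def gemm(m: int, n: int, k: int, alpha: float, a: Matrix, b: Matrix,
         beta: float, c: Matrix) -> None:
    """Reference (unoptimized) C = alpha * A @ B + beta * C for one problem."""
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += a[i][p] * b[p][j]
            c[i][j] = alpha * acc + beta * c[i][j]


def batched_gemm(groups: List[GemmGroup]) -> None:
    """Process every problem in every group with a single call.

    The problems are independent, so a real implementation could run them
    in parallel; this sketch simply loops over them.
    """
    for g in groups:
        if not (len(g.a) == len(g.b) == len(g.c)):
            raise ValueError("each group needs equally many A, B and C matrices")
        for a, b, c in zip(g.a, g.b, g.c):
            gemm(g.m, g.n, g.k, g.alpha, a, b, g.beta, c)
```

If all matrices have the same size, the caller passes a list containing one group; otherwise it passes one group per distinct size.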


Similar resources

An Extended Set of Fortran Basic Linear Algebra Subprograms

This paper describes an extension to the set of Basic Linear Algebra Subprograms. The extensions are targeted at matrix-vector operations which should provide for efficient and portable implementations of algorithms for high performance computers. An Extended Set of Fortran Basic Linear Algebra Subprograms Jack J. Dongarra † Mathematics and Computer Science Division Argonne National Laboratory ...


HeteroPBLAS: A Set of Parallel Basic Linear Algebra Subprograms Optimized for Heterogeneous Computational Clusters

This paper presents a software library, called Heterogeneous PBLAS (HeteroPBLAS), which provides optimized parallel basic linear algebra subprograms for Heterogeneous Computational Clusters. This library is written on the top of HeteroMPI and PBLAS whose building blocks, the de facto standard kernels for matrix and vector operations (BLAS) and message passing communication (BLACS), are optimize...


A Proposal for a Set of Parallel Basic Linear Algebra Subprograms

This paper describes a proposal for a set of Parallel Basic Linear Algebra Subprograms (PBLAS). The PBLAS are targeted at distributed vector-vector, matrix-vector and matrix-matrix operations with the aim of simplifying the parallelization of linear algebra codes, especially when implemented on top of the sequential BLAS. At first glance, because of the apparent simplicity of its s...


Towards Reversible Basic Linear Algebra Subprograms: A Performance Study

Problems such as fault tolerance and scalable synchronization can be efficiently solved using reversibility of applications. Making applications reversible by relying on computation rather than on memory is ideal for large scale parallel computing, especially for the next generation of supercomputers in which memory is expensive in terms of latency, energy, and price. In this direction, a case ...



Journal

Journal title: ACM Transactions on Mathematical Software

Year: 2021

ISSN: 0098-3500, 1557-7295

DOI: https://doi.org/10.1145/3431921